Search for: All records

Creators/Authors contains: "Liu, Eric"

  1. null (Ed.)
    In modern Machine Learning, model training is an iterative, experimental process that can consume enormous computation resources and developer time. To aid in that process, experienced model developers log and visualize program variables during training runs. Exhaustive logging of all variables is infeasible, so developers are left to choose between slowing down training via extensive conservative logging, or letting training run fast via minimalist optimistic logging that may omit key information. As a compromise, optimistic logging can be accompanied by program checkpoints; this allows developers to add log statements post-hoc and "replay" desired log statements from a checkpoint, a process we refer to as hindsight logging. Unfortunately, hindsight logging raises tricky problems in data management and software engineering. Done poorly, hindsight logging can waste resources and generate technical debt embodied in multiple variants of training code. In this paper, we present methodologies for efficient and effective logging practices for model training, with a focus on techniques for hindsight logging. Our goal is for experienced model developers to learn and adopt these practices. To make this easier, we provide an open-source suite of tools for Fast Low-Overhead Recovery (flor) that embodies our design across three tasks: (i) efficient background logging in Python, (ii) adaptive periodic checkpointing, and (iii) an instrumentation library that codifies hindsight logging for efficient and automatic record-replay of model training. Model developers can use each flor tool separately as they see fit, or they can use flor in hands-free mode, entrusting it to instrument their code end-to-end for efficient record-replay. Our solutions leverage techniques from physiological transaction logs and recovery in database systems. Evaluations on modern ML benchmarks demonstrate that flor can produce fast checkpointing with small user-specifiable overheads (e.g., 7%), and still provide hindsight log replay times orders of magnitude faster than restarting training from scratch.
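The record-replay idea in the abstract above can be illustrated with a minimal sketch. This is not the flor API: the function names (`train_step`, `record`, `replay`), the toy objective, and the fixed checkpoint interval are all hypothetical and chosen only for illustration, whereas the abstract describes adaptive checkpointing. The sketch assumes training is deterministic once the model state and random-number-generator state are restored.

```python
import copy
import random


def train_step(state, rng):
    """Toy update (hypothetical): one noisy gradient step on f(w) = w**2."""
    grad = 2.0 * state["w"] + rng.gauss(0.0, 0.01)
    state["w"] -= 0.1 * grad
    state["step"] += 1
    return grad


def record(num_steps, checkpoint_every, seed=0):
    """Optimistic run: logs nothing, but saves periodic checkpoints."""
    rng = random.Random(seed)
    state = {"w": 5.0, "step": 0}
    checkpoints = {}
    for step in range(num_steps):
        if step % checkpoint_every == 0:
            # Save both model state and RNG state so replay is exact.
            checkpoints[step] = (copy.deepcopy(state), rng.getstate())
        train_step(state, rng)
    return state, checkpoints


def replay(checkpoints, start, stop):
    """Re-run steps up to `stop` from the nearest checkpoint at or before
    `start`, collecting a log statement that was added after the fact."""
    base = max(s for s in checkpoints if s <= start)
    saved_state, rng_state = checkpoints[base]
    state = copy.deepcopy(saved_state)
    rng = random.Random()
    rng.setstate(rng_state)
    log = []
    for step in range(base, stop):
        grad = train_step(state, rng)
        if step >= start:
            log.append((step, grad))  # the post-hoc log statement
    return state, log


final_state, ckpts = record(num_steps=100, checkpoint_every=25)
replayed_state, grad_log = replay(ckpts, start=80, stop=100)
```

Here the replay re-executes only the 25 steps after the checkpoint at step 75 instead of all 100, which is the source of the savings the abstract refers to; real training would also need to checkpoint optimizer state and data-loader position.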
  2. Chemically functional hydrogel microspheres hold significant potential in a range of applications including biosensing, drug delivery, and tissue engineering due to their high degree of flexibility in imparting a range of functions. In this work, we present a simple, efficient, and high-throughput capillary microfluidic approach for controlled fabrication of monodisperse and chemically functional hydrogel microspheres via formation of double emulsion drops with an ultra-thin oil shell as a sacrificial template. This method utilizes spontaneous dewetting of the oil phase upon polymerization and transfer into aqueous solution, resulting in poly(ethylene glycol) (PEG)-based microspheres containing primary amines (chitosan, CS) or carboxylates (acrylic acid, AA) for chemical functionality. Simple fluorescent labelling of the as-prepared microspheres shows the presence of abundant, uniformly distributed and readily tunable functional groups throughout the microspheres. Furthermore, we show the utility of chitosan's primary amine as an efficient conjugation handle at physiological pH due to its low pKa by direct comparison with other primary amines. We also report the utility of these microspheres in biomolecular conjugation using model fluorescent proteins, R-phycoerythrin (R-PE) and green fluorescent protein (GFPuv), via tetrazine–trans-cyclooctene (Tz–TCO) ligation for CS-PEG microspheres and carbodiimide chemistry for AA-PEG microspheres, respectively. The results show rapid coupling of R-PE with the microspheres' functional groups with minimal non-specific adsorption. In-depth protein conjugation kinetics studies with our microspheres highlight the differences in reaction and diffusion of R-PE with CS-PEG and AA-PEG microspheres. Finally, we demonstrate orthogonal one-pot protein conjugation of R-PE and GFPuv with CS-PEG and AA-PEG microspheres via simple size-based encoding. Combined, these results represent a significant advancement in the rapid and reliable fabrication of monodisperse and chemically functional hydrogel microspheres with tunable properties.
  3. null (Ed.)